Pan, tilt and zoom sensors (150) are coupled to a broadcast
camera (140) in order to determine the field of view of the broadcast 
camera and to make a rough estimate of a target's location in the 
broadcast camera's field of view. Pattern recognition techniques can 
be used to determine the exact location of the target in the broadcast 
camera's field of view. If a preselected target is at least partially within 
the field of view of the broadcast camera, all or part of the target's 
image is enhanced. The enhancements include replacing the target 
image with a second image, overlaying the target image or highlighting 
the target image. Examples of a target include a billboard, a portion 
of a playing field (Fig. 1) or another location at a live event. The 
enhancements made to the target's image can be seen by the television 
viewer but are not visible to persons at the live event. 

A METHOD AND APPARATUS FOR ENHANCING THE BROADCAST OF A LIVE EVENT



BACKGROUND OF THE INVENTION

Field of the Invention 

The present invention is directed to a method and apparatus for 
enhancing a television broadcast of a live event. 

Description of the Related Art

The television presentation of live events could be improved by 
enhancing the video in real time to make the presentation more interesting to the 
viewer. For example, television viewers cannot see the entire playing field 
during a sporting event; therefore, the viewer may lose perspective as to where
one of the players or objects is on the field in relation to the rest of the field,
players or objects. During the telecast of football games, cameras tend to zoom
in on the players, which allows the viewer to see only a small portion of the
field. Because the viewer can see only a small portion of the field, the viewer may
not know where a particular player is in relation to the pertinent locations on the
field. For instance, when a player is carrying the football, the television
viewer may not know how far that player has to run for a first down. One
enhancement that would be helpful to television viewers of football games is to 
highlight the field at the point where a player must advance in order to obtain 
a first down. 

An enhancement that would be helpful to viewers of golf tournaments is
to highlight those portions of a golf course that have been notorious trouble spots 
to golfers. While the professional golfer is aware of these trouble spots and hits 
the ball to avoid those spots, the television viewer may not be aware of those 
trouble spots and may wonder why a particular golfer is hitting the ball in a
certain direction. If the golf course were highlighted to show these trouble spots,
a television viewer would understand the strategy that the golfer is using and get
more enjoyment out of viewing the golf tournament. Another useful
enhancement would include showing the contours of the green. Similar
enhancements to the playing field would be useful in other sports as well. 

Furthermore, live events do not take advantage of the scope of the 
television audience with respect to advertising. First, advertisements on display 
at a stadium can be televised; however, many of those advertisements are not
applicable to the television audience. For example, a particular sporting event
may be played in San Francisco and televised around the world. A local store 
may pay for a billboard at the stadium. However, viewers in other parts of the 
United States or in other countries receiving the broadcast may not have access 
to that store and, thus, the broadcast of the advertisement is not effective.
Second, some of the space at a stadium is not used because such use would
interfere with the view of the players or the spectators at the stadium. However, 
using that space for advertisement would be very effective for the television 
audience. For example, the glass around the perimeter of a hockey rink would 
provide an effective place for advertisements to the television audience.
However, such advertisements would block the view of spectators at the
stadium. Third, some advertisements would be more effective if their exposure 
is limited to particular times when customers are thinking of that type of 
product. For example, an advertisement for an umbrella would be more 
effective while it was raining. 

Previous attempts to enhance the video presentation of live events have
not been satisfactory. Some broadcasters superimpose advertisements on the 
screen; however, these advertisements tend to block the view of the event. 
Another solution included digitizing a frame of video and using a
computer with pattern recognition software to locate the target image to be 
replaced in the frame of video. When the target image is found, a replacement 
image is inserted in its place. The problem with this solution is that the software 
is too slow and cannot be effectively used in conjunction with a live event. Such 
systems are even slower when they account for occlusions. An occlusion is 
something that blocks the target. For example, if the target is a billboard on the 
boards around a hockey rink, one example of an occlusion is a player standing 
in front of the billboard. When that billboard is replaced, the new billboard 
image must be inserted into the video such that the player appears to be in front 
of the replacement billboard. 

SUMMARY OF THE INVENTION 

The present invention is directed to a system for enhancing the broadcast 
of a live event. A target, at a live event, is selected to be enhanced. Examples 
of targets include advertisements at a stadium, portions of the playing field (e.g.,
football field, baseball field, soccer field, basketball court, etc.), locations at 
or near the stadium, or a monochrome background (e.g. for chroma-key) 
positioned at or near the stadium. The system of the present invention, roughly 
described, captures video using a camera, senses field of view data for that 
camera, determines a position and orientation of a video image of the target in 
the captured video and modifies the captured video by enhancing at least a 
portion of the video image of the target. Alternative embodiments of the present 
invention include determining the perspective of the video image of the target 
and/or preparing an occlusion for the video image of the target. 

One embodiment of the present invention includes one or more field of 
view sensors coupled to a camera such that the sensors can detect data from 
which the field of view of the camera can be determined. The field of view 
sensors could include pan, tilt and/or zoom sensors. The system also includes
a processor, a memory and a video modification unit. The memory stores a 
location of the target and, optionally, data representing at least a portion of the 
video image of the target. The processor, which is in communication with the 
memory and the field of view sensors, is programmed to determine whether the
target is within the field of view of the camera and, if so, the position of the 
target within a frame of video of the camera. Alternate embodiments allow for 
the processor to determine the position of the target in the frame of video using 
field of view data, pattern (or image) recognition technology, electromagnetic
signals and/or other appropriate means. One exemplar embodiment uses field of
view data to find a rough location of the target and then uses pattern recognition 
to find the exact location. Such a combination of field of view data with pattern 
recognition technology provides for faster resolution of the target's location than 
using pattern recognition alone. 

The video modification unit, which is in communication with the
processor, modifies the frame of video to enhance at least a portion of the video 
image of the target. That is, a target can be edited, highlighted, overlaid or
replaced with a replacement image. For example, a video modification unit can 
be used to highlight a portion of a football field (or other playing field) or
replace a first billboard in a stadium with a second billboard. Because the
system can be configured to use pattern recognition technology and field of view 
sensors, the system can be used with multiple broadcast cameras simultaneously. 
Therefore, a producer of a live event is free to switch between the various 
broadcast cameras at the stadium and the television viewer will see the
enhancement regardless of which camera is selected by the producer.

An alternate embodiment contemplates replacing the field of view
sensors and/or the pattern recognition technology with electromagnetic
transmitters and sensors. That is, the target can be used to emit an 
electromagnetic signal. A sensor can be placed at the camera, or the camera can
be used as a sensor, to detect the signal from the target in order to locate the 
target. Once the target is located within the video frame, the system can 
enhance the video image of the target. A further alternative includes treating the 
target with spectral coatings so that the target will reflect (or emit) a distinct
signal which can be detected by a camera with a filter or other sensor. 

These and other objects and advantages of the invention will appear more 
clearly from the following description in which the preferred embodiment of the 
invention has been set forth in conjunction with the drawings. 

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 depicts a perspective view of part of a football stadium.

Figure 2 depicts a perspective view of the football stadium of Figure 1
as seen by a television viewer after the video has been enhanced.

Figure 3 depicts a block diagram of a subset of the components that make
up the present invention.

Figure 4 depicts a block diagram of a subset of the components that make
up the present invention.

Figure 5 is a flow chart describing the operation of the present invention.

Figure 6 is a flow chart which provides more detail of how the present
invention accounts for occlusions.

Figure 7 is a partial block diagram of an alternate embodiment of the
present invention.

Figure 8 is a partial flow chart describing the operation of the alternate
embodiment depicted in Figure 7.

DETAILED DESCRIPTION

Figure 1 is a partial view of football stadium 100. In the center of 
stadium 100 is a football field 102. Surrounding football field 102 are the seats 
104 for the fans. Between seats 104 and playing field 102 is a retaining wall 
106. On retaining wall 106 is an advertisement AD1. For example purposes
only, assume that a particular television broadcaster has selected four targets for 
enhancement. The first target is an advertisement AD1 to be replaced by 
another advertisement. The second target is a portion of the playing field which 
is to receive an advertisement. For this example, assume that the broadcaster
wishes to place an advertisement in the end zone 108 of the football field. A
third target is an area above the stadium. That is, the television broadcaster may 
wish that when a camera is pointed to the top of the stadium, the viewer sees
an advertisement suspended above the stadium. A fourth target is a location on
the playing field 102 representing the line a team must cross in order to get a first
down. Although the television broadcaster may be enhancing the video image
as discussed above, the spectators and players at the stadium would not see any 
of these enhancements, rather they would view the stadium as depicted in 
Figure 1. 

Figure 2 shows the view of Figure 1, as seen by viewers watching the
broadcast on television, after enhancements are made to the video.
Advertisement AD2 is in the same location as advertisement AD1 was in 
Figure 1. Thus, advertisement AD2 has replaced advertisement AD1. 
Advertisement AD3 is shown in end zone 108. Advertisement AD3 does not 
replace another advertisement because there was no advertisement in end zone 
108 prior to the enhancement. Figure 2 also shows advertisement AD4, which
to the television viewer appears to be suspended above stadium 100. Also 
shown in Figure 2 is a thick line 110 which represents the highlighting of the 
portion of the field that the team on offense must cross in order to get
a first down at a particular moment during the game. In this particular example,
the highlighting of the field consists of a bold thick line. Alternatives include 
different color lines, shading, using a blinking line, varying the brightness, etc. 
The enhancement need not be a line. The enhancement may also be any other 
shape or graphic that is appropriate. Thus, for purposes of this patent an 
enhancement includes editing an image, replacing part of an image with another 
image, overlaying all or part of an image, highlighting an image using any 
appropriate method of highlighting, or replacing an image with video. 

Figure 3 is a block diagram of a subset of the components that make up 
the present invention. The components shown on Figure 3 are typically located 
at a camera bay in the stadium; however, they can be located in other suitable 
locations. Broadcast camera 140 captures a frame of video which is sent to a 
production center as shown by the signal BC1. Broadcast camera 140 has a 
zoom lens, including a 2X Expander (range extender). Connected to broadcast 
camera 140 is a 2X Expander/zoom/focus sensor 152 (collectively a "zoom 
sensor") which senses the zoom in the camera, the focal distance of the camera 
lens, and whether the 2X Expander is being used. The analog output of sensor
152 is sent to an analog to digital converter 154, which converts the analog 
signal to a digital signal, and transmits the digital signal to processor 156. One 
alternative includes using a zoom sensor with a digital output, which would 
remove the need for analog to digital converter 154. Broadcast camera 140 is 
mounted on tripod 144 which includes pan and tilt heads that enable broadcast 
camera 140 to pan and tilt. Attached to tripod 144 are pan sensor 146 and tilt 
sensor 148, both of which are connected to pan-tilt electronics 150. 
Alternatively, broadcast camera 140 can include a built-in pan and tilt unit. In 
either configuration, pan sensor 146, tilt sensor 148 and zoom sensor 152 are 
considered to be coupled to broadcast camera 140 because they can sense data 
representing the pan, tilt and zoom of broadcast camera 140.
Processor 156 is an Intel Pentium processor with supporting electronics;
however, various other processors can be substituted. Processor 156 also 
includes memory and a disk drive to store data and software. In addition to 
being in communication with pan-tilt electronics 150 and analog to digital 
converter 154, processor 156 is in communication (via signal CB1) with a
production center which is described below. 

In one embodiment, pan sensor 146 and tilt sensor 148 are optical 
encoders that output a signal, measured as a number of clicks, indicating the 
rotation of a shaft. Forty thousand (40,000) clicks represent a full 360°
rotation. Thus, a processor can divide the number of measured clicks by 40,000
and multiply by 360 to determine the pan or tilt angle in degrees. The pan and 
tilt sensors use standard technology known in the art and can be replaced by 
other suitable pan and tilt sensors known by those skilled in the relevant art. 
Pan-tilt electronics 150 receives the output of pan sensor 146 and tilt sensor 148,
converts the output to a digital signal (representing pan and tilt) and transmits
the digital signal to processor 156. The pan, tilt and zoom sensors are used to 
determine the field of view of the broadcast camera. Thus, one or more of the 
pan, tilt or zoom sensors can be labeled as a field of view sensor(s). For
example, if a camera cannot zoom or tilt, the field of view sensor would only
include a pan sensor.
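
As an illustrative sketch only, the click-to-angle conversion described above can be written in a few lines of Python. The 40,000 clicks per revolution figure comes from the text; the function and constant names are hypothetical and are not part of the disclosed apparatus.

```python
# Illustrative sketch: names are hypothetical, only the 40,000 figure is from the text.
CLICKS_PER_REVOLUTION = 40_000  # 40,000 encoder clicks represent a full 360 degree rotation


def clicks_to_degrees(clicks: int) -> float:
    """Convert an optical-encoder click count into a pan or tilt angle in degrees."""
    return clicks / CLICKS_PER_REVOLUTION * 360.0


if __name__ == "__main__":
    # A quarter of a revolution (10,000 clicks) corresponds to 90 degrees.
    print(clicks_to_degrees(10_000))
```

The same conversion would apply to both the pan encoder and the tilt encoder, since the text gives a single resolution for both.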

An alternative field of view sensor includes placing marks in various 
known locations in the stadium such that each mark looks different and at least 
one mark will always be visible to the camera while the camera is pointed at the 
relevant portions of the stadium. A computer using pattern recognition
technology can find the mark in a frame of video and, based on the mark's size
and position in the frame of video, determine more precisely the field of view 
and/or pan, tilt or zoom of the camera. A system can also be set up to use 
pan/tilt/zoom sensors in combination with the marks described above so that the 
pan/tilt/zoom can be used to make a rough estimate of where the camera is
pointing and the mark is used to achieve a more accurate estimate. In such a 
combination system the marks need not look different if the placement of the 
marks is predetermined. Another alternative includes placing infrared emitters 
or beacons along the perimeter of the playing field or other portions of the
stadium. A computer can determine an infrared sensor's field of view based on
the location of the signal in the infrared sensor's frame of data. If the infrared
sensor is mounted on a broadcast camera, determining the pan and tilt of the 
infrared sensor determines the pan and tilt of the broadcast camera plus a known
offset. A more detailed discussion of using infrared technology, pan/tilt/zoom
sensors, three dimensional location finding technology and video enhancement 
can be found in U.S. Patent Application No. 08/585,145, A System For 
Enhancing The Television Presentation Of An Object At A Sporting Event, 
incorporated herein by reference. 

Figure 3 shows a second and optional camera labeled as dedicated camera
142. Dedicated camera 142 is mounted on a tripod 157. In one embodiment, 
tripod 157 includes an optional pan sensor 158 and an optional tilt sensor 160, 
both of which are in communication with pan-tilt electronics 150. As will be 
explained below, in one embodiment the dedicated camera is set to one pan and
tilt position; therefore, pan and tilt sensors are not needed. The output of
dedicated camera 142 is the camera signal DC1, which is communicated to the 
production center described below. The present invention will perform its 
function without the use of dedicated camera 142; however, dedicated camera 
142 improves the ability of the system to account for occlusions. Dedicated
camera 142 should be located substantially adjacent to broadcast camera 140.
That means that dedicated camera 142 should be as close as possible to broadcast 
camera 140 so that both will function properly yet their optical axes will be as 
close as practical. Thus, if both cameras are focused on the same object, their 
pan and tilt angles should be very similar. In various alternatives, each broadcast
camera could be associated with more than one dedicated camera. In order to
further enhance performance, each broadcast camera would include a plurality
of dedicated cameras, one dedicated camera for each potential target the
broadcast camera will view.

Figure 4 is a block diagram of the production center. Typically, the 
production center is housed in a truck parked outside of the stadium. However, 
the production center can be at a central office or the components of the 
production center can be spread out in multiple locations. The heart of the 
production center is processor 200. The preferred processor 200 is an Onyx
computer from Silicon Graphics; however, various other suitable processors or 
combinations of processors can perform the necessary functions of the present 
invention. Processor 200 is in communication with video control 202, video 
mixer 204 and multiplexer 206. In one alternative, processor 200 includes more
than one processor. For example, processor 200 could include two Onyx
computers, one for locating the target and one for determining occlusions. 

Broadcasters use many broadcast cameras at the stadium to televise a 
sporting event. The video signals from the various cameras are sent to video 
control 202 which is used to select one broadcast camera for transmission to 
viewers. One embodiment of video control 202 includes a plurality of monitors
(one monitor for each video signal) and a selection circuit. A director (or 
manager, producer, etc.) can monitor the different video signals and choose 
which signal to broadcast. The choice is communicated to the selection
circuit, which selects one camera signal to broadcast. The choice is also
communicated to processor 200, video mixer 204 and multiplexer 206 via signal
208. The selected video signal is sent to delay 210 and processor 200 via analog 
to digital converter 212. If the broadcast camera is a digital camera, then there 
would be no need for analog to digital converter 212. 
The output of delay 210 is sent to video modification unit 214. The
purpose of delay 210 is to delay the broadcast video signal a fixed number of 
frames to allow time for processor 200 to receive data, determine the position 
of the target in the frame of video and prepare any enhancements. Although the 
video is delayed a small number of frames, the television signal is still defined
as live. The delay introduced by the system is a small delay (under one second) 
which does not accumulate. That is, different frames of video are enhanced with 
the same small delay. For example, a ten frame delay is equivalent to one-third 
of a second, which is not considered a significant delay for television. 
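
A fixed frame delay of this kind can be sketched as a first-in, first-out buffer. The following Python sketch is illustrative only: the class and function names are hypothetical, and the frame rate of 30 frames per second is an assumption chosen because it is consistent with the statement that ten frames equal one-third of a second.

```python
from collections import deque


class FrameDelay:
    """Hypothetical fixed-length delay line: each frame comes out delay_frames pushes later."""

    def __init__(self, delay_frames: int):
        self.delay_frames = delay_frames
        self._buffer = deque()

    def push(self, frame):
        """Store the newest frame; return the frame from delay_frames ago, or None while filling."""
        self._buffer.append(frame)
        if len(self._buffer) > self.delay_frames:
            return self._buffer.popleft()
        return None


def delay_seconds(delay_frames: int, frames_per_second: float = 30.0) -> float:
    """Latency introduced by the delay line (the 30 fps default is an assumption)."""
    return delay_frames / frames_per_second


if __name__ == "__main__":
    delay = FrameDelay(10)
    outputs = [delay.push(n) for n in range(15)]
    print(outputs)            # ten None values, then frames 0 through 4
    print(delay_seconds(10))  # about one-third of a second
```

Because every frame passes through the same buffer, the latency is constant and does not accumulate from frame to frame.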

Video mixer 204 receives the video signals from all of the dedicated
cameras. Figure 4 shows signals DC1 and DC2. Signal DC1 is from a dedicated
camera associated with the broadcast camera BC1. If video control 202 selects
BC1 then that selection is communicated to video mixer 204 which selects DC1. 
As discussed above, it is contemplated that some alternatives include having
many dedicated cameras for one broadcast camera. For example, one broadcast
camera may have four dedicated cameras. In that case, the dedicated cameras 
would be labeled DC1a, DC1b, DC1c and DC1d. When broadcast camera BC1
is selected, video mixer 204 would select up to all four dedicated cameras:
DC1a, DC1b, DC1c and DC1d. The selected signal(s) from video mixer 204
is sent to analog to digital converter 216 which digitizes the video signal(s) and
sends the digital signal(s) to processor 200. 

Multiplexer 206 receives signals from the processors at each of the 
camera locations. For example, Figure 4 shows multiplexer 206 receiving signal 
CB1 from processor 156 of Figure 3. Each of the processor signals (CB1, CB2,
...) is associated with a broadcast camera. Thus, the selection by video
control 202 is communicated to multiplexer 206 so that multiplexer 206 can send 
the corresponding signal to processor 200. The signal sent by multiplexer 206 
to processor 200 includes the information from the field of view sensors. In one 
embodiment, processor 156 calculates the field of view and sends the resulting
information, via multiplexer 206, to processor 200. In another embodiment, 
processor 200 receives the data via multiplexer 206 and determines the field of 
view. Either alternative is suitable for the present invention. 
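
The way one selection by video control 202 fans out to video mixer 204 and multiplexer 206 can be sketched as a lookup table. This Python sketch is illustrative only. The signal names BC1, DC1, DC2, CB1 and CB2 appear in the text; the name BC2, the table layout and the function name are assumptions made for the example.

```python
# Hypothetical routing table: which dedicated-camera signals and which sensor
# signal belong to each broadcast camera. "BC2" is an assumed name for the
# broadcast camera that pairs with DC2 and CB2.
CAMERA_ROUTING = {
    "BC1": {"dedicated": ["DC1"], "sensor_signal": "CB1"},
    "BC2": {"dedicated": ["DC2"], "sensor_signal": "CB2"},
}


def route_selection(selected_broadcast: str) -> dict:
    """Return what each unit forwards to processor 200 for the selected camera."""
    entry = CAMERA_ROUTING[selected_broadcast]
    return {
        "video_control": selected_broadcast,      # sent to delay 210 and processor 200
        "video_mixer": list(entry["dedicated"]),  # dedicated camera signal(s)
        "multiplexer": entry["sensor_signal"],    # field of view sensor data
    }


if __name__ == "__main__":
    print(route_selection("BC1"))
```

A broadcast camera with several dedicated cameras would simply list more than one entry under "dedicated".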

Processor 200 is connected to memory 220 which stores the locations of
the targets and images of the targets (or at least partial images). Memory 220 
also stores images of the replacement graphics, instructions for creating 
replacement graphics and/or instructions for highlighting, editing, etc. Memory
220 is loaded with its data and maintained by processor 222. The inventors
contemplate that during operation of this system, processor 200 will be too busy
to use compute time for loading and maintaining memory 220. Thus, a separate 
processor 222 is used to load and maintain the memory during operation. If cost 
is a factor, processor 222 can be eliminated and processor 200 will be used to 
load and maintain memory 220; however, for optimal performance memory 220 
should be loaded, if possible, prior to the broadcast.

The images and locations of targets can be loaded into memory 220 
either manually or automatically. For example, if the target's image and 
location are known in advance (e.g. an advertisement at the stadium) then prior 
to real-time operation of the system an operator can input the location of the 
target and scan in (or otherwise download) an image of the target.
Alternatively, the operator can point one or more cameras at the target and use 
a mouse, light pen or other pointing device to select the target's image for 
storing in memory 220. The location of the target can be determined by 
physical measurement, using pan/tilt/zoom sensors, etc. If the target is not 
known in advance (for example if the target is the first down yard line) then the
operator can select the target during operation using a pointing device and the 
system will download the image of the target and its location (using 
pan/tilt/zoom data) to memory 220. Alternatively, the system can be 
programmed to know that the target is one of a set of possible targets. For
example, the system can be programmed to know that the target is a yard line 
and the operator need only input which yard line is the current target. The 
replacement graphics are loaded into memory after being digitized or downloaded,
or the replacement graphics can be created with processor 222. Instructions for
highlighting or creating replacement graphics can be programmed using 
processor 222 or processor 200. 

Processor 200 is connected to video modification unit 214. The output 
of video modification unit 214, labeled as signal 226, is the video signal
intended for broadcast. This signal can be directly broadcast or sent to other
hardware for further modification or recording. Video modification unit 214 
modifies the video signal from delay 210 with the data/signal from processor 
200. The type of modification can vary depending on the desired graphic result. 
One exemplar implementation uses a linear keyer as video modification unit
214. When using a keyer, the signal from processor 200 to the keyer
includes two signals: YUV and an external key (alpha). The YUV signal is 
called foreground and the signal from delay 210 is called background. Based on 
the level of the external key, the keyer determines how much of the foreground 
and background to mix to determine the output signal, from 100 percent
foreground and zero percent background to zero percent foreground and 100
percent background, on a pixel by pixel basis. Alternatively, video modification 
unit 214 can be another processor or video modification unit 214 can be a part 
of processor 200. 
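
The pixel-by-pixel mix performed by a linear keyer can be sketched as a weighted sum of foreground and background. The following Python sketch is illustrative only and is not the keyer hardware: it assumes the external key has been normalized to the range 0.0 to 1.0, represents a frame as a nested list of (Y, U, V) tuples, and uses hypothetical function names.

```python
def key_pixel(foreground, background, alpha):
    """Mix one pixel: alpha 1.0 gives all foreground, alpha 0.0 gives all background.

    alpha is the external key level, assumed normalized to the range 0.0..1.0.
    """
    return tuple(alpha * f + (1.0 - alpha) * b for f, b in zip(foreground, background))


def key_frame(foreground, background, key):
    """Apply key_pixel to every pixel of equally sized frames (lists of rows)."""
    return [
        [key_pixel(f, b, a) for f, b, a in zip(f_row, b_row, k_row)]
        for f_row, b_row, k_row in zip(foreground, background, key)
    ]


if __name__ == "__main__":
    graphic = [[(200, 128, 128), (200, 128, 128)]]  # foreground from processor 200
    camera = [[(50, 100, 100), (50, 100, 100)]]     # background from delay 210
    key = [[1.0, 0.0]]                              # first pixel replaced, second untouched
    print(key_frame(graphic, camera, key))
```

Intermediate key values blend the two sources, which is one way a highlight could let the underlying field remain partly visible.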

In operation, processor 200 determines the field of view of the selected 
broadcast camera and checks memory 220 to see if any targets are within that
field of view. If so, processor 200 then determines the exact position of the 
target in a frame of video by determining which pixels represent the target. 
Processor 200 then checks memory 220 for the replacement graphic or 
instructions to make a replacement graphic (or highlight). If the replacement
strategy is to highlight a certain portion of a field, then memory 220 may include 
instructions for changing the color of a certain portion of the field, shading of 
a certain portion of the field, etc. Based on the pan, tilt and zoom, and the
actual image of the target, processor 200 determines the size and orientation of
the replacement graphic (also called mapping). In one embodiment, the 
enhancement includes processor 200 creating a frame of video with a graphic at 
the position of the enhancement. The frame created by processor 200 is sent to 
video modification unit 214 which combines the frame from processor 200 with
the frame from delay 210. As will be described below, processor 200 is also
used to account for occlusions. An alternate embodiment includes eliminating 
the separate video modification unit and using processor 200 to edit the video 
signal from the selected broadcast camera. 

Figure 5 is a flow chart which explains the operation of the present
invention. In step 300, video data is captured by a broadcast camera and is
digitized. If the broadcast camera is a digital camera, digitizing is unnecessary. 
Simultaneously with step 300, pan, tilt and zoom data (field of view data) is 
sensed in step 302 and the field of view is determined in step 304. In step 306, 
processor 200 determines if any of the targets are within the field of view.
Memory 220 (depicted in Figure 4) includes a database. In one alternative, the
database stores the three dimensional locations of all the targets. The field of 
view of a broadcast camera can be thought of as a pyramid whose location and 
dimensions are determined based on the field of view data. After determining 
the dimensions and location of the pyramid, processor 200 accesses memory
220 to determine if any of the targets are within the pyramid. Step 306 is a
quick method for determining if there is a target within the field of view of the 
camera. If not, the process is done and the system waits until the next frame of 
data. If there is a target within the field of view of the selected broadcast 
camera, then the exact position of the target must be determined within the
frame of video of the selected broadcast camera. 
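
One way the pyramid test of step 306 could be realized is sketched here in Python. Everything beyond the idea of a view pyramid is an assumption made for illustration: the world axes (x east, y north, z up), pan measured counter-clockwise from the x axis, tilt measured upward from horizontal, horizontal and vertical field of view angles already derived from the zoom reading, and all function names.

```python
import math


def camera_axes(pan_deg, tilt_deg):
    """Unit vectors (forward, right, up) of the camera under the assumed conventions."""
    p, t = math.radians(pan_deg), math.radians(tilt_deg)
    forward = (math.cos(t) * math.cos(p), math.cos(t) * math.sin(p), math.sin(t))
    right = (math.sin(p), -math.cos(p), 0.0)
    up = (-math.sin(t) * math.cos(p), -math.sin(t) * math.sin(p), math.cos(t))
    return forward, right, up


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def to_camera_coords(target, camera_pos, pan_deg, tilt_deg):
    """Express a 3D target location as (right, up, depth) relative to the camera."""
    offset = tuple(t - c for t, c in zip(target, camera_pos))
    forward, right, up = camera_axes(pan_deg, tilt_deg)
    return dot(offset, right), dot(offset, up), dot(offset, forward)


def target_in_view(target, camera_pos, pan_deg, tilt_deg, hfov_deg, vfov_deg):
    """True if the target lies inside the view pyramid of the camera."""
    x, y, depth = to_camera_coords(target, camera_pos, pan_deg, tilt_deg)
    if depth <= 0.0:
        return False  # target is behind the camera
    half_width = depth * math.tan(math.radians(hfov_deg) / 2.0)
    half_height = depth * math.tan(math.radians(vfov_deg) / 2.0)
    return abs(x) <= half_width and abs(y) <= half_height


if __name__ == "__main__":
    camera = (0.0, 0.0, 0.0)
    print(target_in_view((100.0, 10.0, 5.0), camera, 0.0, 0.0, 20.0, 12.0))
    print(target_in_view((100.0, 50.0, 0.0), camera, 0.0, 0.0, 20.0, 12.0))
```

The test needs only a few multiplications per stored target, which fits the description of step 306 as a quick check performed before any image analysis.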

Preferably, determining the position of the target is a two-step process. 
In the first step (step 308) a rough estimate is made based on the pan, tilt and
zoom values and in the second step the estimate of the target's position is refined
(step 310). In regard to step 308, by knowing where the camera is pointed and 
the target's three dimensional location, the target's position in the video frame 
can be estimated. The accuracy of step 308 is determined by the accuracy of the 
pan/tilt/zoom sensors, the software used to determine the field of view and the
stability of the platform on which the camera is located. In some alternatives,
the field of view sensor equipment may be so accurate that the position of the 
target is adequately determined and step 310 is not necessary. In other 
instances, the pan, tilt and zoom data only provides a rough estimate in step 308 (e.g.,
a range of positions or general area of position) and step 310 is needed to
determine a more accurate position.
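
The rough estimate of step 308 can be sketched as a pinhole projection followed by a search window that reflects the sensor uncertainty. This Python sketch is illustrative only. It assumes the target's offset from the camera has already been rotated into camera axes (x to the right, y up, depth along the optical axis), that pixels are square, and that the frame is 720 by 486 pixels; those values and all function names are hypothetical.

```python
import math


def project_to_pixel(x, y, depth, hfov_deg, width, height):
    """Project a camera-space point onto the image plane (pinhole model, assumed)."""
    focal_px = (width / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)
    u = width / 2.0 + focal_px * x / depth   # columns grow to the right
    v = height / 2.0 - focal_px * y / depth  # rows grow downward
    return u, v


def search_window(u, v, uncertainty_px, width, height):
    """Rectangle (left, top, right, bottom) around the estimate, clamped to the frame."""
    left = max(0, math.floor(u - uncertainty_px))
    top = max(0, math.floor(v - uncertainty_px))
    right = min(width, math.ceil(u + uncertainty_px))
    bottom = min(height, math.ceil(v + uncertainty_px))
    return left, top, right, bottom


if __name__ == "__main__":
    u, v = project_to_pixel(5.0, 2.0, 100.0, 20.0, 720, 486)
    print(u, v)
    print(search_window(u, v, 20, 720, 486))
```

The window is the "range of positions or general area" handed to the refinement step; a more accurate sensor would justify a smaller uncertainty_px.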

Step 310 provides a more accurate determination of the target's position 
using pattern recognition techniques which are known in the art. Examples of
known pattern recognition and image processing technology can be found in the
following documents: U.S. Patent No. 3,973,239, Pattern Preliminary
Processing System; U.S. Patent No. 4,612,666, Automatic Pattern Recognition
Apparatus; U.S. Patent No. 4,674,125, Real-Time Hierarchal Pyramid Signal 
Processing Apparatus; U.S. Patent No. 4,817,171, Pattern Recognition System; 
U.S. Patent No. 4,924,507, Real-Time Optical Multiple Object Recognition and 
Tracking System and Method; U.S. Patent No. 4,950,050, Optical Target 

25 Recognition System; U.S. Patent No. 4,995,090, Optoelectronic Pattern 
Comparison System; U.S. Patent No. 5,060,282, Optical Pattern Recognition 
Architecture Implementing The Mean-Square Error Correlation Algorithm; U.S. 
Patent No. 5,142,590, Pattern Recognition System; U.S. Patent No. 5,241,616, 
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Optical Pattern Recognition System Utilizing Resonator Array; U.S. Patent No. 
5,274,716, Optical Pattern Recognition Apparatus; U.S. Patent No. 5,465,308, 
Pattern Recognition System; U.S. Patent No. 5,469,512, Pattern Recognition 
Device; and U.S. Patent No. 5,524,065, Method and Apparatus For Pattern 

5 Recognition. It is contemplated that step 310 can use suitable technology other 
than pattern recognition technology. 
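
As one illustration of how a rough estimate might be refined, the sketch below searches a small window around the estimate for the best match to a stored template, using a sum of squared differences. This technique is chosen here only for simplicity; the specification does not prescribe it, and the function name, the list-of-rows image format and the search radius are assumptions.

```python
def refine_position(frame, template, rough_xy, search_radius):
    """Return the (x, y) top-left corner where template best matches frame.

    frame and template are 2D lists of brightness values (rows of pixels).
    rough_xy is the rough top-left estimate; only positions within
    search_radius pixels of it are examined. The match score is the sum of
    squared differences, and the lowest score wins.
    """
    th, tw = len(template), len(template[0])
    fh, fw = len(frame), len(frame[0])
    rx, ry = rough_xy

    best_xy, best_score = None, None
    for y in range(max(0, ry - search_radius), min(fh - th, ry + search_radius) + 1):
        for x in range(max(0, rx - search_radius), min(fw - tw, rx + search_radius) + 1):
            score = sum(
                (frame[y + j][x + i] - template[j][i]) ** 2
                for j in range(th)
                for i in range(tw)
            )
            if best_score is None or score < best_score:
                best_xy, best_score = (x, y), score
    return best_xy
```

Restricting the search to a window around the pan, tilt and zoom estimate keeps the comparison cheap, which is the practical benefit of the two-step arrangement.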

In step 312, processor 200 fetches the replacement graphic from memory
220. If memory 220 is storing instructions for replacement graphics, then
processor 200 fetches the instructions and creates the graphic. For example,
creating the graphic can include drawing a highlight for the yard line of a
football field. In step 314, processor 200 determines the size and orientation of
the replacement image, and maps the replacement image to the video frame.
Memory 220 stores the image in only one size. Because of the pan, tilt and zoom
of the broadcast camera, the image stored in memory 220 may need to be
mapped to the video frame (e.g. magnified, reduced, twisted, angled, etc.).
Processor 200 can determine the orientation based on the field of view data
and/or the pattern recognition analysis in step 310. For example, by knowing
where the broadcast camera is located and the pan, tilt and zoom of the
broadcast camera, a computer can be programmed to determine how to map the
replacement image or highlight onto the video frame.
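
The mapping of step 314 can be pictured as sending each point of the stored image to a point inside the four-sided region the target occupies in the video frame. The sketch below uses bilinear interpolation between the four corner positions as a simplification; an exact perspective mapping would use a projective transform instead. The function name and corner ordering are assumptions made for this example.

```python
def map_to_frame(u, v, corners):
    """Map normalized image coordinates (u, v) in [0, 1] to frame pixels.

    corners is (top_left, top_right, bottom_right, bottom_left), each an
    (x, y) pixel position of the target's corner in the video frame.
    Bilinear interpolation is used, which approximates but does not equal
    a true perspective mapping.
    """
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = corners

    # Interpolate along the top and bottom edges, then between them.
    top = (x0 + u * (x1 - x0), y0 + u * (y1 - y0))
    bottom = (x3 + u * (x2 - x3), y3 + u * (y2 - y3))
    return (top[0] + v * (bottom[0] - top[0]),
            top[1] + v * (bottom[1] - top[1]))
```

Magnifying, reducing or angling the stored image then amounts to choosing the four corner positions from the field of view data or the pattern recognition result.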

In step 316, the system accounts for occlusions. If there is an object or
person in front of the target, then the enhanced video should show the object or
person in front of the replacement graphic, highlight, etc. In one embodiment,
the system cuts out a silhouette in the shape of the object or person from the
replacement image. Step 316 is discussed in more detail with respect to
Figure 6.

In step 318, the system modifies the video of the original broadcast
camera. As discussed above, this could include creating a second frame of video
which includes a replacement image and using a keyer to combine the second
frame of video with the original frame of video. Alternatively, a processor can
be used to edit the frame of video of the broadcast camera. It is possible that
within a given frame of video there may be more than one target. In that case
steps 308-318 may be repeated for each target, or steps 308-316 may be repeated
for each target and step 318 may be performed only once for all targets. Subsequent
to step 318, the enhanced frame of video may be broadcast or stored, and the
process (steps 300-318) may repeat for another frame of video.
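
The combination performed by a keyer can be sketched as a per-pixel blend controlled by a key signal. In the minimal example below, which assumes single-channel frames stored as lists of rows, a key value of 1.0 shows the replacement image, 0.0 keeps the original video, and the key is set to 0.0 wherever an occlusion silhouette has been cut out. The function name and data layout are hypothetical.

```python
def key_frame(original, replacement, key):
    """Blend replacement over original, pixel by pixel, using a key signal.

    All three arguments are 2D lists of equal size. A key of 1.0 shows the
    replacement pixel, 0.0 keeps the original pixel, and values in between
    mix the two linearly.
    """
    return [
        [k * r + (1.0 - k) * o for o, r, k in zip(orow, rrow, krow)]
        for orow, rrow, krow in zip(original, replacement, key)
    ]
```

With several targets in one frame, the key signals for all targets could be merged first so that the blend runs only once, matching the alternative in which step 318 is performed a single time.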

Figure 6 is a more detailed flow diagram explaining how the system
accounts for occlusion. The steps described in Figure 6 are performed by a
system which includes one or more dedicated cameras (e.g. dedicated camera
142). Step 350 is performed before the live event occurs. In one embodiment,
there is a dedicated camera substantially adjacent to a broadcast camera for each
target that the broadcast camera may view. For example, if there are three
advertisements which are to be replaced in the stadium and a particular camera
can view two of those advertisements, then the system can include two dedicated
cameras substantially adjacent to that particular camera. Prior to the game, a
dedicated camera is pointed directly at one of the targets; the camera is zoomed
in such that the target fills a substantial portion of the dedicated camera's frame
of video; and the image of the target is stored in memory 220. A substantial
portion means that the target typically appears to cover over half of the frame
of video of the dedicated camera. For optimal results, the dedicated camera
should be zoomed in such that the target fills the greatest amount of the frame
of video possible while remaining completely within the frame of video, unless
it is desired to have clues of the scenery surrounding the target. After the
dedicated camera is pointed at the target, its pan, tilt and zoom should remain
fixed.

Once the television broadcast of the live event begins, steps 352-362 are
repeated for each frame where the occlusion analysis is desired. In step 352, a
video image is captured and digitized by the dedicated camera. Simultaneously,
a video image is captured by the broadcast camera. In step 354, the digitized
image from the dedicated camera is compared to the stored image of the target.
The stored image is stored in memory 220. The processor knows which stored
image to compare with from step 306 of Figure 5. The step of comparing could
include altering one of the images such that both images are the same size and
orientation, and then subtracting the data. Alternatively, other methods can be
used to compare. If there is an occlusion blocking the target (step 356), then the
two images will be significantly different and, in step 358, an occlusion will be
reported. In reporting the occlusion, the system reports the presence of an
occlusion and the coordinates of the occlusion. When performing step 354, it
is possible that there is no occlusion even though the two images are not exactly
the same. The differences between the images must meet a certain minimum
threshold to be considered an occlusion. If the differences are not great enough
to be an occlusion, then in step 360 the system determines that the differences
are due to ambient conditions in the stadium. For example, if the lights have
been dimmed then the captured image of the target may appear darker. Weather
conditions could also have an effect on the appearance of the target image. If
small differences are detected in step 360 that do not meet the threshold for
occlusions, then the system "learns" the changes to the target by updating the
stored image of the target to reflect the new lighting or weather conditions (step
362). For example, the new stored image of the target may be darker than the
original image. Subsequent to step 362 the system performs the report step 358
and reports that no occlusion was found.
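
The comparison, threshold test and "learning" update of steps 354 through 362 can be sketched as follows. The example assumes the two images are already the same size and orientation and are stored as lists of rows of brightness values. The use of two thresholds (a per-pixel difference and a fraction of changed pixels) is an assumption made for this illustration; the text only requires that the differences meet some minimum threshold.

```python
def analyze_target(captured, stored, pixel_threshold, occlusion_fraction):
    """Compare a captured target image with the stored target image.

    Returns (occluded, coords, stored_image). coords lists the (x, y)
    pixels that differ by more than pixel_threshold. If the share of such
    pixels reaches occlusion_fraction, an occlusion is reported and the
    stored image is left unchanged. Otherwise the differences are treated
    as ambient changes and the stored image is replaced by the capture.
    """
    coords = [
        (x, y)
        for y, (crow, srow) in enumerate(zip(captured, stored))
        for x, (c, s) in enumerate(zip(crow, srow))
        if abs(c - s) > pixel_threshold
    ]
    total = len(stored) * len(stored[0])

    if len(coords) / total >= occlusion_fraction:
        return True, coords, stored  # report occlusion and its coordinates

    # Small differences: update the stored image to the new conditions.
    updated = [list(row) for row in captured]
    return False, [], updated
```

A gradual dimming of the lights changes every pixel slightly and therefore updates the stored image, while a person standing in front of the target changes a block of pixels strongly and is reported as an occlusion.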

An alternative to the method of Figure 6 includes comparing the target
image from the broadcast camera to the stored image. However, using the
broadcast camera is not as advantageous as using a dedicated camera because it
is likely that the broadcast camera would not be zoomed in on the target. Thus, the
target image is likely to be smaller on the broadcast camera than it is on the
dedicated camera. Because there is a small image to work with, the system loses
the subpixel accuracy obtained from the dedicated camera. Also, using a
separate dedicated camera may increase the speed at which the system accounts
for occlusions.

Figure 7 shows an alternative embodiment of the present invention which
utilizes electromagnetic transmitting beacons at or near a target. The beacons
transmit an electromagnetic signal not visible to the human eye.
Electromagnetic waves include light, radio, x-rays, gamma rays, microwave,
infrared, ultraviolet and others, all involving the propagation of electric and
magnetic fields through space. The difference between the various types of
electromagnetic waves is in the frequency or wavelength. The human eye is
sensitive to electromagnetic radiation of wavelengths from approximately 400 to
700 nm, the range called light, visible light or the visible spectrum. Thus, the
phrase "electromagnetic signal not visible to a human eye" means an
electromagnetic wave outside of the visible spectrum. It is important that the
signal transmitted from the beacon is not visible to the human eye so that the visual
appearance of the target will not be altered for those people attending the live
event. In one embodiment, the beacon is an electromagnetic transmitter which
includes infrared emitting diodes. Other sources which transmit electromagnetic
waves may also be used, for example, radio transmitters, radar repeaters, etc.

Figure 7 shows a broadcast camera 400 which outputs a video signal 402.
Broadcast camera 400 includes a zoom lens coupled to a zoom detector 404.
The output of zoom detector 404 is transmitted to analog to digital converter 406
which sends the digital output to processor 408. Mounted on top of broadcast
camera 400 is sensor 410. In the embodiment which utilizes an infrared emitter
as a beacon, sensor 410 is an infrared sensor. Sensor 410 is mounted on top of
broadcast camera 400 so that the optical axis of sensor 410 is as close as possible
to the optical axis of broadcast camera 400. It is also possible to locate sensor
410 near broadcast camera 400 and account for differences between optical axes
using matrix transformations or other suitable mathematics.
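
One way to picture such a matrix transformation is a 3x3 matrix applied to homogeneous pixel coordinates, which converts a position in the sensor's frame to the corresponding position in the broadcast camera's frame. The sketch below is illustrative only: the matrix values would come from a calibration procedure that the text does not describe, and the function name is hypothetical.

```python
def apply_transform(matrix, point):
    """Apply a 3x3 homogeneous transform to a 2D point (x, y).

    matrix is a list of three rows of three numbers. The point is treated
    as (x, y, 1), multiplied by the matrix, and divided by the resulting
    third component.
    """
    x, y = point
    xh = matrix[0][0] * x + matrix[0][1] * y + matrix[0][2]
    yh = matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]
    w = matrix[2][0] * x + matrix[2][1] * y + matrix[2][2]
    return (xh / w, yh / w)
```

For example, a matrix combining a scale factor of 2 with an offset of (10, -5) would map sensor pixel (100, 50) to (210, 95) in the broadcast frame.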

One example of an infrared sensor is a progressive scan, full frame
shutter camera, for example, the TM-9701 by Pulnix. The Pulnix sensor is a
high resolution 768(H) by 484(V) black and white full frame shutter camera with
asynchronous reset capability. The camera has an eight bit digital signal output
and progressively scans 525 lines of video data. A narrow band infrared filter
is affixed in front of the lens of the Pulnix sensor. The purpose of the filter is
to block electromagnetic signals that are outside the spectrum of the signal from
the beacon. The sensor captures a frame of video (data) which comprises a set
of pixels. Each pixel is assigned a coordinate corresponding to an x-axis and a
y-axis. The sensor data includes an eight bit brightness value for each pixel;
these values are scanned out pixel by pixel to interface 412 along with other
timing information. Interface 412 outputs four signals: LDV, FDV, CK and DATA.
LDV (line data valid) is transmitted to X-Y counters 414 and indicates that a
new line of valid data is being scanned out of sensor 410. FDV (frame data
valid), which is transmitted to X-Y counters 414 and memory control 416,
indicates that valid data for the next frame is being transmitted. CK (pixel
clock) is a 14.318 MHz clock from sensor 410 sent to X-Y counters 414 and
memory control 416. X-Y counters 414 count X and Y coordinates
sequentially in order to keep track of the location of the pixel whose data is
being scanned in at the current time. When LDV is asserted, the X counter is
reset. When FDV is asserted, the Y counter is reset.

The signal DATA includes the eight bit data value for each pixel. As data
is read from sensor 410, memory control 416 determines whether the pixels
meet a brightness threshold. That is, noise and other sources will cause a large
number of pixels to receive some data. However, the pixels receiving the signal
from the beacon will have at least a minimum brightness level. This brightness
threshold is set in a register (not shown) which can be set by processor 408. If
the data for a particular pixel is above the brightness threshold, memory control
416 sends a write enable (WE) signal to memory 418, causing memory 418 to
store the X and Y coordinates of the pixel, the data for that pixel and a code for
that pixel. The code indicates that the data is valid data, a new frame, end of
frame or a flash. Processor 408 can read the data from memory 418 and process
the data locally or transmit the data to the production center (e.g., to multiplexer
206).
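
The interplay of the X-Y counters and the memory control can be modelled in software as follows. This is a behavioural sketch, not a description of the hardware: the code value is a placeholder because the text gives no encoding, and the assumption that the Y counter advances once per completed line is made here for illustration.

```python
VALID = "valid"  # hypothetical code value; the text does not give an encoding

def scan_frame(rows, threshold):
    """Model one frame being scanned out of the sensor.

    rows is a list of lines, each a list of eight bit brightness values.
    Returns the entries that would be written to memory: a tuple
    (x, y, value, code) for every pixel brighter than threshold.
    """
    memory = []
    y = 0                          # frame start: Y counter reset
    for row in rows:
        x = 0                      # line start: X counter reset
        for value in row:          # one pixel per pixel-clock tick
            if value > threshold:  # memory control raises write enable
                memory.append((x, y, value, VALID))
            x += 1
        y += 1                     # assumed: Y advances once per line
    return memory
```

Because only pixels above the threshold are stored, the memory holds a short list of candidate beacon pixels rather than a full frame.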

Many arenas do not allow photographers to use flashes on their cameras
in order to prevent impairing a player's vision from random flashes during a
sporting event. In lieu of individual camera flashes, many arenas install a set of
strobe flashes at or near the ceiling of the arenas and provide for communication
between each photographer's camera and the set of strobe flashes. When the
photographer takes a picture, the strobe flashes emit a flash of light, which may
include an electromagnetic wave in the infrared spectrum. In one embodiment,
the system avoids using incorrect data due to sensors detecting a flash by using
filters. A second embodiment connects a signal from a strobe flash to a
computer which causes the system to ignore data sensed during a flash. A third
embodiment includes using flash detectors. The flash detector can be located
anywhere in the arena suitable for sensing a strobe flash. Figure 7 shows flash
detector 422 which detects a flash and sends a signal to memory control 416.
Flash detector 422 includes a photo detector which can comprise, at least, a
photo diode and an opamp. In front of the photo detector would be a filter that
allows detection of signals in a spectrum that includes the signals emitted by the
beacon. Connected to the opamp are components which can detect pulse edges.
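
Ignoring data sensed during a flash can be sketched in software as two steps: find the flash intervals from the rising and falling edges of a photo detector reading, then discard sensor frames captured inside those intervals. The sample format, the detection level and both function names are assumptions made for this example; the text describes hardware edge detection rather than this code.

```python
def flash_intervals(samples, level):
    """Return (start, end) times during which the reading is at or above level.

    samples is a list of (time, reading) pairs in time order. A rising edge
    opens an interval and a falling edge closes it.
    """
    intervals, start, above = [], None, False
    for t, reading in samples:
        now_above = reading >= level
        if now_above and not above:
            start = t                      # rising edge
        elif above and not now_above:
            intervals.append((start, t))   # falling edge
        above = now_above
    if above:
        intervals.append((start, samples[-1][0]))  # flash still on at the end
    return intervals

def drop_flash_frames(frames, intervals):
    """Keep only the (time, data) frames captured outside every flash interval."""
    return [
        (t, data) for t, data in frames
        if not any(s <= t <= e for s, e in intervals)
    ]
```

The same effect could be obtained by tagging stored pixels with a flash code and skipping them when the processor reads the memory.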

The embodiment described in Figure 7 operates similarly to the
embodiment described in Figure 3. Some of the differences between the
operation of the two embodiments are depicted in Figure 8. Similar to the
embodiment in Figure 3, the embodiment in Figure 7 first captures and digitizes
video data. In step 450, infrared data is received. In step 452, the system
determines whether a target is found in the infrared data by monitoring the data
stored in memory 418. Since memory control 416 only allows data above a
threshold to be stored in memory 418, if a given frame of data from a sensor has
pixel data stored in memory then a target is found. If a sensor is detecting false
targets, then various error correction methods known in the art can be utilized.
In step 454, the position of the target is determined in the frame of video by
reading the X and Y coordinates stored with the pixel data in memory 418. Step
456 fine tunes the determined position information of the target to account for
the error from the camera's platform or pan/tilt/zoom sensors. One alternative
for accounting for the difference in optical axis is to use a transformation matrix;
however, other mathematical solutions known in the art are also suitable. After
step 456, the system can perform steps 312 through 318 as described with
respect to Figure 5; however, any field of view data used is based on the size
and position of the beacon's signal in the sensor's frame of video.
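
Steps 452 and 454 can be illustrated by reducing the stored pixel entries to a single position. The sketch below reports no target when nothing was stored and otherwise returns a brightness-weighted centroid. The centroid is an assumption chosen for this example; the text says only that the stored X and Y coordinates are read.

```python
def locate_beacon(stored_pixels):
    """Return the beacon position in the sensor frame, or None if not found.

    stored_pixels is a list of (x, y, brightness) entries that passed the
    brightness threshold. The position is the brightness-weighted centroid
    of those entries.
    """
    total = sum(b for _, _, b in stored_pixels)
    if not stored_pixels or total == 0:
        return None  # no data stored for this frame: no target found
    cx = sum(x * b for x, _, b in stored_pixels) / total
    cy = sum(y * b for _, y, b in stored_pixels) / total
    return (cx, cy)
```

The spread of the stored pixels around this position could likewise give a rough measure of the size of the beacon's signal in the frame.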

A further alternative of Figure 7 includes using polarization. That is, the
infrared filter on sensor 410 is replaced or augmented with a polarized filter.
A target to be replaced (e.g., a billboard) is treated with a spectral coating that
allows only polarized light to reflect off the billboard. The filter and spectral
coating are designed such that light reflecting off the billboard to sensor 410 will
be completely blacked-out. The pixels that represent the position of the target in
the sensor's frame of video will have a brightness value of zero or close to zero.
Thus, memory control 416 is used to store only data that has a brightness
value of zero or below a threshold level.

The foregoing detailed description of the invention has been presented
for purposes of illustration and description. It is not intended to be exhaustive
or to limit the invention to the precise form disclosed, and obviously many
modifications and variations are possible in light of the above teaching. The
described embodiments of the system for enhancing the broadcast of a live event
were chosen in order to best explain the principles of the invention and its
practical application to thereby enable others skilled in the art to best utilize the
invention in various embodiments and with various modifications as are suited
to the particular use contemplated. The invention is, thus, intended to be used
with many different types of live events including various sporting events and
nonsporting events. It is intended that the scope of the invention be defined by
the claims appended hereto.

CLAIMS

What is claimed is: 



1. A method for enhancing the broadcast of a live event, comprising the steps of:
capturing first video using a first camera;
sensing field of view data representing a field of view of said first camera;
determining a position and orientation of a video image of a target in said captured video at least partially based on recognizing one or more portions of said video image of said target in said captured video; and
modifying said captured video data by enhancing at least a segment of said video image of said target.

2. The method according to claim 1, wherein said step of determining a position includes the steps of:
determining a rough estimate of said position of said target in said captured video using said field of view data; and
determining a more precise estimate of said position of said target in said captured video using a pattern recognition technique.

3. The method according to claim 1, further including the step of:
determining whether said target is within said field of view of said first camera.

4. The method according to claim 1, wherein:
the step of determining is also at least partially based on comparing said field of view data to prestored location data for said target.

5. The method according to claim 1, wherein:
said step of modifying replaces a first advertisement with a second advertisement.

6. The method according to claim 1, wherein:
said step of modifying replaces an image of a surface in a stadium with an advertisement.

7. The method according to claim 1, wherein:
said step of modifying includes highlighting a portion of a playing field.

8. The method according to claim 1, wherein:
enhancing said video image of said target does not include replacing said video image of said target; and
said method further including the step of accounting for occlusions.

9. The method according to claim 1, further including the steps of:
capturing second video using a second camera, said second video including said target, said second camera zoomed such that said target substantially fills most of said second camera's field of view;
detecting an occlusion of said target in said second video; and
using said detection of said occlusion from said second video to determine where said occlusion is positioned in said first video;
said step of modifying said first video does not replace said occlusion.

10. The method according to claim 1, further including the steps of:
storing said target's location before said step of capturing; and
storing an unoccluded image of said target before said step of capturing.

11. A method according to claim 1, further including the step of:
learning changes to said target image.

12. The method according to claim 1, further including the steps of:
comparing said video image of said target in said captured video with a video image stored in a memory; and
updating said memory to include a revised image of said target.

13. A method for enhancing the broadcast of a video image of a target at a live event, comprising the steps of:
capturing a frame of video using a first camera;
sensing an electromagnetic signal transmitted from said target, said electromagnetic signal not being visible to the human eye;
determining a position and orientation of said video image of said target in said frame of video, at least partially based on said electromagnetic signal; and
modifying said video data by enhancing at least a segment of said video image of said target.

14. A method according to claim 13, wherein:
said step of determining includes determining the pixel position of the target in said sensor frame of data.

15. A method according to claim 13, wherein:
said electromagnetic signal is an infrared signal.

16. A method according to claim 13, further including the step of:
storing data, based on said electromagnetic signal, that has a value greater than a predetermined threshold.

17. A method according to claim 13, further including the step of:
ignoring data from said electromagnetic signal if sensed during a flash.

18. A method for enhancing the broadcast of a target at a live event, comprising the steps of:
capturing a first frame of video using a first camera;
capturing a second frame of video using a second camera, said second frame of video including said target;
determining if said target is within said first frame of video;
determining a position and orientation of said target in said first frame of video;
detecting an occlusion of said target in said second frame of video;
determining where said detected occlusion is positioned in said first frame of video at least partially based on said step of detecting; and
modifying said first frame of video by enhancing said target in said first frame of video without enhancing said detected occlusion.

19. A method according to claim 18, wherein:
said second camera is pointed at said target and is located substantially adjacent said first camera;
said step of detecting an occlusion includes comparing at least a portion of said second frame of video to an unoccluded image of said target.

20. A method according to claim 19, wherein:
said second camera is zoomed such that said target fills a substantial portion of said second frame of video.

21. A method according to claim 19, further including the steps of:
storing said unoccluded image of said target prior to said step of capturing said first frame of video; and
updating said stored unoccluded image of said target if lighting conditions change.

22. A system to be used with a first camera for enhancing the broadcast of a target at a live event, comprising:
one or more field of view sensors coupled to said camera such that said one or more field of view sensors can detect field of view data representing said first camera's field of view;
memory storing a location of said target; and
one or more processors, in communication with said memory and said one or more field of view sensors, said one or more processors programmed to determine whether said target is within the field of view of said camera and to determine where said target is positioned within a frame of video of said first camera.

23. A system according to claim 22, wherein:
said memory stores data representing a video image of said replacement graphic.

24. A system according to claim 22, further including:
a video modification unit, in communication with said one or more processors, for modifying said frame of video to enhance at least a section of said video image of said target with a replacement graphic.

25. A system according to claim 24, wherein:
said video modification unit is a linear keyer.

26. A system according to claim 24, wherein:
said video modification unit is a processor.

27. A system according to claim 24, wherein:
said video modification unit highlights a portion of a football field.

28. A system according to claim 24, wherein:
said video modification unit replaces a first billboard with a second billboard.

29. A system according to claim 24, wherein:
said video modification unit adds a first billboard to said frame of video.

30. A system according to claim 22, wherein:
said one or more field of view sensors includes a pan sensor, a tilt sensor and a zoom sensor.

31. A system according to claim 22, further including:
a second camera pointed at said target, in communication with said one or more processors and located substantially adjacent to said first camera.

32. A system according to claim 22, further including:
a video control in communication with said first camera and said one or more processors;
a video mixer in communication with said second camera and said one or more processors; and
a video delay unit in communication with said video control and said video modification unit.

33. A system for enhancing the broadcast of target at a live event, comprising:
a plurality of broadcast cameras;
a plurality of field of view sensors, each sensor coupled to one of said broadcast cameras;
a multiplexer in communication with said field of view sensors for selectively transmitting a signal from one of said field of view sensors;
a video delay unit;
a video control unit in communication with said broadcast cameras, said video control unit selectively transmits to said video delay unit a signal from one of said broadcast cameras;
a plurality of dedicated cameras with a fixed field of view and pointed at one of said plurality of targets, each dedicated camera located substantially adjacent to a broadcast camera;
a video mixer in communication with said video control unit and said dedicated camera for selectively transmitting a signal from one of said dedicated cameras, said selected one of said dedicated cameras being substantially adjacent to said selected one of said broadcast cameras;
memory storing the location of said targets;
one or more processors, in communication with said memory and said multiplexer, said one or more processors receives said selected signal from said video control unit, said one or more processors programmed to determine whether one of said targets is positioned within the field of view of one of said broadcast cameras and to determine where said one target is within a frame of video of said one broadcast camera;
a video modification unit, in communication with said one or more processors, for modifying said frame of video to enhance at least a section of said video image of said target with a replacement graphic.

34. A system, to be used with a first camera, for enhancing the broadcast of a live event, comprising:
a target including an electromagnetic transmitter;
a sensor adapted to receive an electromagnetic signal from said target, said electromagnetic signal is not visible to a human eye;
a memory storing the location of said target;
one or more processors, in communication with said memory and said sensor, said one or more processors programmed to determine whether said target is within the field of view of said first camera and to determine where said target is within a frame of video of said first camera.

35. A system according to claim 34, further including:
a video modification unit, in communication with said one or more processors, for modifying said frame of video to replace at least a section of said video image of said target with at least a replacement graphic.

36. A system according to claim 34, wherein:
said electromagnetic signal is an infrared signal.



BNSDOCID: <WO 981 8261 A1_l_> 




BNSDOCID: <WO 9818261A1_I_> 



WO 98/18261 



« 



PCT/US97/16878 



2/6 



BC1 




DC1 



144 



142 



Dedicated 
Camera 



158 




160 



-157 



150 



r-{^162 



Pan-tilt 
Electronics 



Processor 



CB1 



154 

jL 



A/D 



156 



FIG. 3 



BNSDOCID: <WO 9818261A1_L> 



WO 98/18261 



3/6 



t- CM 

o o 

OQ CQ 






"<- CM 

o o 

Q Q 


| j Q O O 






| J o o o 


video 


208 


video mixer 


control 202 






204 




208 




video 
modification 
214 





memory 
220 








processor 
222 





226 



FIG. 4 



BNSDOCID: <WO 981 8261 A1_l_> 



WO 98/18261 



• 



PCT/US97/16878 



4/6 



308' 



find 
target 
using ptz 



refine target 
position 



replacement 
graphic 



determine size 
& orientation 



I 



capture 
and digitize 



^ 300 



sense ptz 



determine fov 



yes 




302 



304 



no 



310 



312 



314 



account for occlusions 



316 



modify video 



318 




306 



FIG. 5 



BNSDOCID: <WO 981 8261 A1J_> 



WO 98/18261 





PCT/US97/16878 



5/6 



set up 



3> capture video 



compare 



350 



352 



354 



yes 



report ^— 

358 
no 




356 



360 



yes 



update 



362 



FIG. 8 



FIG. 6 



receive 
data 



450 



-SI 



determine if 
target found 



1 



determine 
position 



452 



454 



5L 



account for 
optical axis 



456 



BNSDOCID: <WO 981826'At J_> 



WO 98/18261 



• 



PCT/US97/16878 



6/6 




404 



406 



414- 



X-Y 
Counters 



-Y Code 
VJ~F 



Mem 
Control 



418 



Memory 



WE 



data 



416 



Processor 



408 



420 



FIG. 7 



INTERNATIONAL SEARCH REPORT

International application No. PCT/US97/16878

A. CLASSIFICATION OF SUBJECT MATTER
IPC(6): H04N 7/18
US CL: 348/143, 157
According to International Patent Classification (IPC) or to both national classification and IPC

B. FIELDS SEARCHED
Minimum documentation searched (classification system followed by classification symbols)
U.S.: 348/143, 151, 157-159, 169-172; 250/206.1, 206.2, 206.3; 352/53; 473/570, 5*8

C. DOCUMENTS CONSIDERED TO BE RELEVANT
Citation of document, with indication, where appropriate, of the relevant passages; relevant to claim No.

US 5,465,144 A (PARKER ET AL) 07 November 1995, FIGS. 1 AND 2, COLS. 1, 5-6. Relevant to claims 1-8, 10-17, 22-30, 34-36.
US 5,508,737 A (LANG) 16 April 1996, FIGS. 1 AND 4, COLS. 3-4. Relevant to claims 9, 18-21, 31-33.
US 5,564,698 A (HONEY ET AL) 15 October 1996. Relevant to claims 1-36.
US 4,067,015 A (MOGAVERO ET AL) 03 January 1978. Relevant to claims 1-36.
US 4,064,528 A (BOWERMAN) 20 December 1977. Relevant to claims 1-36.

Date of the actual completion of the international search: 21 NOVEMBER 1997
Date of mailing of the international search report: 23 JAN 1998

Name and mailing address of the ISA/US:
Commissioner of Patents and Trademarks
Box PCT
Washington, D.C. 20231
Facsimile No. (703) 305-3230
Telephone No. (703) 308-6612

Form PCT/ISA/210 (second sheet) (July 1992)